ABSTRACT

Data Farming combines the rapid prototyping capability
inherent in certain simulation models with the exploratory
power of high performance computing to rapidly generate
insight into questions. The Data Farming process focuses
on a more complete landscape of possible system responses,
rather than attempting to pinpoint an answer. Data Farming
allows decision makers to more fully understand the
landscape of possibilities and also consider outliers that
may be discovered. Over the past decade, an international
community has formed around these ideas. In 2008,
International Data Farming Workshop 16 took place in
Monterey, California, USA and workshop number 17 was held
in Garmisch-Partenkirchen, Germany. In addition to a
summary of these two workshops, this paper presents an
overview of how the process has developed, including both
methods and applications, in the International Data Farming
Community.

1  INTRODUCTION 

Data Farming was first introduced to the Winter Simulation
community at the 1999 Winter Simulation Conference in
Phoenix (Horne 1999). The ideas behind Data Farming had
been initially developed much earlier, but were introduced
by Dr. Horne to the defense community in 1997 (Horne 1997)
in concert with the combination of agent-based models with
high performance computing that was the start of Project
Albert.

Project Albert was a congressionally funded modeling and
simulation initiative of the United States Marine Corps
(USMC) motivated by the fact that complex adaptive systems
are pervasive in USMC operations. The philosophy of Project
Albert was to pair simple, efficient, abstract models with
high performance computing to explore large design spaces.
When these models and high performance computing are
combined with efficient experimental designs developed in
work pioneered at the Naval Postgraduate School (NPS) (e.g.
see Sanchez and Lucas 2002), a huge sample space can be
explored very rapidly. And when rapid prototyping
capabilities and collaborative environments are introduced
into the Data Farming process, progress on questions, even
long-standing and difficult questions involving many
interacting variables, is possible.

Project Albert used what are referred to as agent-based
distillation models. These are a type of computer
simulation which attempts to model the critical factors of
interest in combat without explicitly modeling all of the
physical details. Some of the models used in Project Albert
were MANA, PAX, and Pythagoras, all agent-based models,
although the methods developed can be applied using any
type of simulation model. These models continue to be
developed and recent updates are described in Lauren
(2007), Lampe (2007), and Henscheid (2007) respectively.
But agent-based models are small and abstract and can
easily be run many times to test a variety of parameter
values and get an idea of the landscape of possibilities.
The term distillation is added because the intent is to
distill the question at hand down into as simple a
representation as possible. Also, models used in Project
Albert were specifically developed and used because the
capability to rapidly prototype scenarios is very important
in the process.

Although Project Albert was a US-sponsored effort, it had a
strong spirit of international collaboration which made
possible a great deal of cooperative effort among
researchers around the world. For example, the three models
mentioned in the paragraph above were developed in New
Zealand, Germany, and the US respectively. And the fact
that Project Albert was question-based also allowed
practitioners from around the world to rally around the
developing Data Farming methodologies because of the impact
upon their shared application interests.

Because many of the questions of interest have wide
applicability, the work teams at international workshops
typically consist of representatives from two or three and
sometimes up to six different countries, and the workshops
overall have usually consisted of about 8 to 10 teams. The
first international workshop was organized in 1999 and 12
workshops were held under the auspices of Project Albert,
which ended in September 2006. Since that time the
international Data Farming community has continued
cooperative efforts and the most recent workshops, the 16th
and 17th workshops, were held in 2008. Our presentation
here will highlight these most recent workshops, but
details on the other workshops and other Data Farming
efforts can be found at http://harvest.nps.edu, the website
of the SEED (Simulation Experiments and Efficient Designs)
Center for Data Farming at the United States Naval
Postgraduate School.

2  DATA  FARMING 

2.1  Overview 

Data Farming combines the rapid prototyping of agent-based
distillations with the exploratory power of high
performance computing to rapidly generate insight into
military questions. Data Farming focuses on a more complete
landscape of possible system responses, rather than
attempting to pinpoint an answer. This “big picture”
solution landscape is an invaluable aid to the decision
maker in light of the complex nature of the modern
battlespace. And while there is no such thing as an optimal
decision in a system where the enemy has a role, Data
Farming allows the decision maker to more fully understand
the landscape of possibilities and thereby make more
informed decisions. Data Farming also allows for the
discovery of outliers that may lead to findings that allow
decision makers to no longer be surprised by surprise.

The simulations that defense analysts use are often large
and complex. Also, even the smaller, more abstract
agent-based distillations referred to above can have many
parameters that are potentially significant and that could
take on many values. And response surfaces can be highly
non-linear. Thus, even with high performance computing and
the small models used in Data Farming, gridded designs
where every value is simulated are unwieldy.
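
To make the scale concrete, the short sketch below counts the
runs a fully gridded design would require. The factor counts,
level counts, and replication count are illustrative
assumptions only, not figures from any Project Albert study.
Under these assumptions, 10 factors at 10 levels each already
give 10^10 design points, or 3 x 10^11 runs at 30 replications
per point.

```python
def gridded_design_points(num_factors: int, levels_per_factor: int) -> int:
    """Number of design points when every combination of levels is simulated."""
    return levels_per_factor ** num_factors


def total_runs(num_factors: int, levels_per_factor: int, replications: int) -> int:
    """Design points multiplied by the stochastic replications per point."""
    return gridded_design_points(num_factors, levels_per_factor) * replications


if __name__ == "__main__":
    # Hypothetical study sizes, chosen only to show the growth rate.
    for k in (2, 5, 10, 20):
        print(k, "factors x 10 levels x 30 replications:", total_runs(k, 10, 30))
```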

Thus, using efficient experimental designs is essential.
Work in this area has been performed at the Naval
Postgraduate School (NPS) in Monterey, California, and NPS
researchers have collaborated with others worldwide as well
(see Kleijnen, Sanchez, Lucas, and Cioppa 2005). Data
Farming continues to evolve from initial Project Albert
efforts (Hoffman and Horne 1998) to the work documented in
the latest edition of the Scythe (Horne and Meyer 2008b).
This publication contains the proceedings of the
International Data Farming Workshops that have taken place
since Project Albert ended and is put out by the SEED
Center for Data Farming at NPS.
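
As an illustration of what an efficient design can look like,
the sketch below builds a plain random Latin hypercube, one
common space-filling design. It is a simplified stand-in and
not the specific designs developed in the cited work; the
function name, the factor ranges, and the seed are assumptions
made for this example. Each factor's range is divided into as
many bins as there are design points, and every bin is sampled
exactly once per factor, so the number of runs grows with the
number of points requested rather than exponentially with the
number of factors.

```python
import random


def latin_hypercube(num_points, ranges, seed=0):
    """Return num_points design points as tuples of factor values.

    Each factor's range is cut into num_points equal bins and every
    bin is used exactly once for that factor.
    """
    rng = random.Random(seed)
    columns = []
    for low, high in ranges:
        width = (high - low) / num_points
        # Draw one value inside each bin, then shuffle the bin order.
        column = [low + (i + rng.random()) * width for i in range(num_points)]
        rng.shuffle(column)
        columns.append(column)
    # Transpose so each tuple holds one value per factor.
    return list(zip(*columns))


if __name__ == "__main__":
    # Two hypothetical factors: sensor range in [0, 10), speed in [5, 15).
    for point in latin_hypercube(10, [(0.0, 10.0), (5.0, 15.0)], seed=1):
        print(point)
```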

2.2  Question-Based 

Over the past few years several articles have captured the
fundamentals of Data Farming (e.g. Horne and Meyer 2005 and
Lawler 2005), but the key is the question at hand. At the
Naval Postgraduate School, over 60 theses that used Data
Farming, many by international students, have been
completed over the past decade and, as we shall present in
the next section, over 100 international work teams have
formed around questions at International Data Farming
Workshops.

These types of questions can never have precisely defined
initial conditions and a complete set of algorithms that
describe the system being considered. These questions
address open systems that defy prediction. Data Farming is
used to provide insight that can be used by
decision-makers. To accomplish this formidable task, Data
Farming relies upon two basic ideas:

1. use high performance computing (HPC) to execute models
many times over varied initial conditions to gain
understanding of the possible outliers, trends, and
distribution of results, and

2.  develop  models,  called  distillations,  that  are  focused 
to  specifically  address  the  question. 

Data Farming, by providing the ability to process large
parameter spaces, makes possible the discovery of surprises
(both positive and negative) and potential options.

2.3  Iterative  Process 

Data Farming is an iterative team process (Horne and Meyer
2004b). Figure 1 presents the Data Farming process as a set
of embedded loops. This process normally requires input and
participation by subject matter experts, modelers,
analysts, and decision-makers.


Figure 1: Data Farming Iterative Process (embedded loops, including the Data Farming Loop)


The “Scenario Creation” loop shown on the left side of the
figure involves developing and honing a model that
adequately represents the system that addresses the
question being asked by the decision-maker. This is an
iterative process that often requires honing the question
as well.

The “Scenario Run Space Execution” loop shown in Figure 1
is entered once the basecase of the scenario is complete.
In this loop the team defines a study which determines
which scenario input parameters should be examined and what
processes should be used to vary them. Here the team is
exploring the possible variations (or excursions of the
basecase) in the initial conditions of the scenario.
Specifically, those parameters that address the question
being posed are considered.

The defined study is used to guide the execution of many
runs of the model in the HPC environment. Each run produces
output which is collected by the Data Farming system and
provided as input to analysis capabilities. After analysis
of the results, the team (or an algorithm) may decide to
adjust or produce a new study or adjust the model to more
adequately address the question. This process continues
until insight related to the decision-maker’s question has
been gained.
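
The run, collect, analyze, and adjust cycle described above can
be sketched in a few lines. Everything in this sketch is a
hypothetical stand-in: `toy_model` replaces a real distillation
such as MANA, PAX, or Pythagoras, the two input parameters and
their effect on the outcome are invented, the loop runs serially
rather than in an HPC environment, and a fixed iteration count
replaces the team's judgment about when enough insight has been
gained. The `refine` step plays the role of the "algorithm" that
produces a new study around the most promising excursion.

```python
import random
import statistics


def toy_model(sensor_range, speed, rng):
    """Stand-in for a distillation model: returns one noisy outcome score."""
    signal = 2.0 * sensor_range - 0.5 * (speed - 3.0) ** 2
    return signal + rng.gauss(0.0, 1.0)


def run_study(design_points, replications, seed=0):
    """Execute every design point several times and collect the raw outputs."""
    rng = random.Random(seed)
    results = {}
    for point in design_points:
        results[point] = [toy_model(*point, rng) for _ in range(replications)]
    return results


def analyze(results):
    """Summarize each design point by mean and spread, best mean first."""
    summary = [
        (point, statistics.mean(outputs), statistics.stdev(outputs))
        for point, outputs in results.items()
    ]
    return sorted(summary, key=lambda row: row[1], reverse=True)


def refine(best_point, step):
    """New study: a small grid of excursions around the best point so far."""
    sensor_range, speed = best_point
    return [
        (sensor_range + dr, speed + ds)
        for dr in (-step, 0.0, step)
        for ds in (-step, 0.0, step)
    ]


def farm(initial_design, iterations=3, replications=30, step=1.0):
    """Run, analyze, and adjust the study for a fixed number of iterations."""
    design, history = initial_design, []
    for _ in range(iterations):
        summary = analyze(run_study(design, replications))
        history.append(summary[0])  # best (point, mean, stdev) this round
        design = refine(summary[0][0], step)
        step /= 2.0
    return history


if __name__ == "__main__":
    for point, mean, spread in farm([(1.0, 1.0), (2.0, 3.0), (3.0, 5.0)]):
        print(point, round(mean, 2), round(spread, 2))
```

In practice the spread at each design point matters as much as
the mean, since points with unusually wide or skewed outputs are
where outliers and surprises are most likely to be found.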

3  WORLD  WIDE  PARTNERS 

The first international workshop took place in 1999. Some
of the cities and towns where the Project Albert and
International Data Farming Workshops took place are
world-renowned and some perhaps are not. They include
Kihei, Auckland, Cairns, Überlingen, Monterey, Quantico,
Singapore, Wellington, Stockholm, Honolulu, Boppard, The
Hague, and Garmisch-Partenkirchen. But the actual
workshops, in fact celebrations of sorts, are merely the
culmination of a great deal of work that takes place in
between them. Collaboration and contributions to the
overall advancement of Data Farming take place in the
development of simulation models, scenarios within the
models, and computer clusters to run the models audacious
numbers of times.

But the real work is in making progress on important
questions and the real secret is the combination of
military subject matter experts and highly knowledgeable
and multi-disciplinary scientists. This special mix of
personnel has been the hallmark of the international
workshops. It has been a dynamic combination to have Data
Farming work teams headed up by a person who really knows
and cares about the question (e.g. a military officer who
knows that the answers may have an impact on both mission
success and lowering casualties) and supported by men and
women with technical prowess who can leverage the tools
available.

Countries represented throughout the decade of workshops
include Australia, Canada, Germany, Mexico, New Zealand,
Norway, Portugal, Singapore, South Korea, Sweden, Turkey,
the United Kingdom, and the United States. But each nation
involved in Data Farming has its own story regarding its
contributions and what it has received from its
participation.

Here we provide some details regarding how German
involvement increased from participation in the early
workshops, when the methodologies were in the beginning
stages of development. Project Albert began with an
emphasis on using Data Farming tools within combat
situations, and the simulation tools were developed to
represent these situations. The German delegation then
brought in questions regarding peace support operations.
Human factors modeling and the influences of intangibles
are becoming more and more essential in this question area.
To simulate the non-attrition based parts in peace support
operations, the model PAX (after the Roman goddess of
peace) was developed in Germany and released to the
International Data Farming Community. The contributions led
to a broad acceptance of Data Farming in the German
modeling and simulation community.

The Project Albert mission was clearly to develop the
methodology of Data Farming in collaborative environments.
All German applications had a clear goal: not to replace
the classical modeling and simulation tools by new ones but
to apply both methods in an "operational synthesis" (see
Brandstein 1999). The application of complex adaptive
systems theory, with the modeling following the agent based
paradigm, had the goal of exploring the wide field of
non-linearity, co-evolution, and intangibles. Results were
a continuum of solutions in the sense of optimization
theory, with the relevant tools for statistical
experimental design and semi-automated evaluation
techniques directing the user to unknown effects, or
“surprises”, and to interrelations in the analysis of a
variety of possible progressions.

The international community drove the German mission
through other methodological developments such as Nearly
Orthogonal Latin Hypercube experimental designs (e.g. see
Cioppa 2002), model developments (MANA, Pythagoras, PAX,
etc.), application of agent based development environments
(NetLogo, REPAST, MASON, etc.) and through evaluation and
analysis tools of many types (e.g. see Upton 2004). In the
international workshops the availability of the experts and
the free and open information sharing led to a big success
in Germany in the application of the tools (Schweirz 2008).

Other Data Farming efforts around the world are documented
in a variety of places. The beginnings of development in
the United States are documented in Maneuver Warfare
Science 1998 (Hoffman and Horne 1998). And additional
volumes of Maneuver Warfare Science from 2001, 2002, and
2003 contain contributions from the US as well as Sweden,
New Zealand, Australia, and Singapore. Also, the book It’s
Alive (Meyer and Davis 2003) contains a chapter describing
some of the initial USMC efforts. Many presentations
involving Data Farming have also been made at INFORMS
meetings (e.g. Horne and Meyer 2004a) and MORS Symposia
(e.g. McDonald 2008) over the past decade, and Winter
Simulation Sessions on Data Farming were held in 2004,
2005, and this year. Finally, the Scythe is a regular
publication from the SEED Center for Data Farming that
documents workshop proceedings.

4  INTERNATIONAL  WORKSHOP  TOPIC  SUMMARY

Through international workshop 16 we have had over 100 work
teams in a variety of areas. Of course some of those work
teams have continued from workshop to workshop. For
example, the combat identification team, which has had
representation over the years from 6 different countries,
started at workshop 12 and continues to the present. These
100+ work teams fall into areas, or themes, which include:
Joint and Combined Operations (e.g. C4ISR Operations,
Network Centric Warfare, Networked Fires, and Future Combat
Missions), Urban Operations, Combat Support (e.g. UAV
Operations, Robotics, Logistics, and Combat ID), Peace
Support Operations, the Global War on Terrorism, Homeland
Defense, and Disaster Relief. Other work teams have looked
into continuous support of modeling such as efficient
designs, new models, model improvements, automated red
teaming, and automated coevolution. In the next section we
will present some of the specifics of recent workshops.

5  INTERNATIONAL  WORKSHOPS  IN  2008 

Two workshops were held in 2008. International Data Farming
Workshop (IDFW) 16 was held in Monterey, California, USA
from 13 to 18 April and IDFW 17 was held in
Garmisch-Partenkirchen, Germany from 21 to 26 September.
The latter occurred too late to be included in this
summary, but the 11 teams from IDFW 16 are listed below to
give a flavor of the breadth of topics explored at an IDFW
(Meyer and Horne 2008b).

Team 1 used Pythagoras to explore the contribution of small
unmanned ground vehicles to small unit combat
effectiveness. The team developed a building clearing
scenario and examined different vehicle capabilities such
as speed, sensor range, and vulnerability.

Team  2  built  on  the  work  of  a  completed  NPS  thesis 
which  examined  questions  regarding  the  new  Littoral 
Combat  Ship  using  MANA.  The  team  illustrated  the  power 
of  Data  Farming  by  conducting  over  40,000  replications  to 
help  understand  the  implications  of  a  variety  of  possible 
red  tactics. 

Team  3  was  an  internationally  co-led  team  which  used 
both  the  PAX  and  MANA  models  and  applied  Automated 
Red  Teaming  to  investigate  different  aspects  of  the  same 
problem  involving  peace  support  operations.  The  scenario 
used  in  this  team’s  Data  Farming  was  based  on  a  crowd 
control  situation  in  a  stabilization  operation. 


Team  4  used  the  opportunity  to  participate  in  IDFW 
16  to  begin  an  effort  using  the  agent-based  sensor  effector 
model  (ABSEM)  recently  developed  in  Germany.  They 
presented  the  main  ideas  of  the  ABSEM  to  learn  from  the 
available  expertise  and  they  plan  to  data  farm  a  prototype 
in  the  future. 

Team  5  used  the  Logistics  Battle  Command  model  and 
experimental  design  techniques  to  assess  the  impact  that 
Soldier  level  network  enabled  capabilities  have  on  cargo 
operations  at  a  truck  terminal  node  within  a  sustainment 
base  supporting  a  joint  force. 

Team 6 was led by personnel from the Joint Test and
Evaluation Methodology program. This team applied design of
experiments and Data Farming using MANA for developing
evaluation strategies for testing in a joint environment.

Team 7 not only won the best poster competition, but also
used Data Farming to explore parameters and assumptions
using the Total Life Cycle Management-Assessment tool on a
Marine Light Armored Vehicle.

Team 8 conducted Data Farming experiments using their agent
based model, which represents situational awareness and the
cognitive process of combining new sensor input with it to
make identification decisions.

Team 9 used Pythagoras and a scenario developed for a
prototype multi-agent system model of a civilian population
to explore the response of the civilian population to
insurgent, government, and stability force actions in a
counterinsurgency environment.

Team 10 used the Joint Dynamic Allocation of Fires and
Sensors (JDAFS) model, which is being reviewed as a tool to
support Joint Starting Condition data development. They
explored a joint battlespace scenario in a Data Farming
environment to identify possible improvements to JDAFS.

And finally, Team 11 built on research started in Canada on
a systems dynamics model used to explore the use of
non-lethal weapons in crowd confrontation situations. They
used Data Farming and design of experiments approaches to
help determine the most sensitive parameters and develop a
robust set of rules of engagement.

6  INVITATION 

This paper has two purposes. The first is to describe the
concept of Data Farming and give an overview of how it is
being used worldwide. The second is to invite you to become
part of our International Data Farming Community. We value
openness, collaboration, and having fun in the process. By
planting seeds of knowledge throughout the world we feel
that we can grow the methods and tools to begin to provide
answers to the difficult questions of our age. We invite
you to contact us, we invite you to use our tools and
methods, and we invite you to join us in person at our next
International Data Farming Workshop in Monterey,
California, USA from 22 through 27 March 2009. There we
will continue to strive to outline the landscapes of
possibilities, discover surprises, and uncover those
dynamic truths central to understanding questions that we
share.